This article introduces Generational ZGC (JEP-439), the Key Encapsulation Mechanism API (JEP-452), and Code Snippets in Java API Documentation (JEP-413). The first extends the still fairly new Z Garbage Collector with generations, which promises a performance boost on heaps of any size. The second is a new API for securely transmitting the key of a symmetric encryption scheme with the help of public/private key methods. The last is about the inclusion of code examples in the Java API documentation.
JEP-439: Generational ZGC
An instance on the heap can be removed when it’s no longer needed. In hardware-oriented languages, developers are responsible for determining the correct, safe time. The right time is when no other instance has a reference to the instance stored any longer. Call-by-reference calls to methods can propagate these references widely through the system, making the analysis complex. Manually determining when it’s safe to remove them is error-prone. That’s why the JVM, like many modern ecosystems, relies on the concept of a garbage collector, or GC for short. A garbage collector automatically monitors the memory and removes instances at the safest time.
Hidden Hero
The author derives Hidden Heroes from the term “Hidden Champions”, which was coined by Hermann Simon. Hidden champions refer to small companies that are world market leaders but have little public awareness. Similarly, hidden Java heroes are features that have a great impact on the JDK environment, but receive too little attention from the community.
The Java Virtual Machine has long offered several garbage collector implementations with different focuses. They all share the GC pause, during which the application must be paused:
- Serial GC is the simplest GC. It uses only one thread and pauses the application while it runs. It’s suitable for single-core systems but not for hardware with multiple cores. There is no benefit from multiple computational cores.
- Parallel GC behaves like Serial GC but uses multiple cores, making it a better alternative for applications on multicore hardware. As with Serial GC, the GC pause length is primarily dependent on the heap memory’s size.
- G1 GC is actually called Garbage First GC [G1-GC] and applies partitions on the heap. Partitions are prioritized and analyzed in ascending order according to free memory. G1 GC analyzes and removes instances until the configured fixed length of GC pause has elapsed. Often, not all partitions are processed completely. G1’s goal is to have as fixed a pause time as possible. It’s also well suited for machines with many processors and large heaps.
The Z Garbage Collector [Z-GC], Z GC for short, is relatively new. It was introduced as an experimental feature in Java 11 and has been production-ready since Java 15. Z GC performs the labor-intensive analysis of the heap in separate threads, in parallel with the running application. This means that the application only has to be interrupted briefly to synchronize threads. The GC pause typically stays below 1 ms and is independent of the analyzed heap size. This parallelism is achieved with the help of colored pointers and load barriers. A colored pointer is a memory address decorated with meta information about the state of the instance. Load barriers are executed whenever the application accesses a reference. They evaluate the colored pointer and perform any necessary redirects before the application accesses the instance, which hides address changes from the application. Z GC’s highly optimized algorithm is described in detail in the article Deep-dive of ZGC’s Architecture [ZGC-ARCH] on Dev.java. It can efficiently handle very small and also very large heaps of up to 16 TB. Z GC can be activated with the command line parameter -XX:+UseZGC. When using Z GC for the first time, it’s recommended that you also enable GC logging (-Xlog:gc) to allow fine tuning of the configuration. Besides the number of threads used by Z GC (-XX:ConcGCThreads) and some Linux-specific parameters, returning unused memory to the operating system can also be controlled (-XX:+ZUncommit, which is enabled by default).
With JEP-439: Generational Z GC [JEP-439], Z GC is extended by partitioning the heap into generations: one area each for young and old instances. The area for young instances is further divided into Eden and Survivor. New instances are usually created in Eden and copied to Survivor if they “survive” their first GC run. After they’ve “survived” a fixed number of GC runs in Survivor, instances are copied to the area for old instances. This partitioning exploits the weak generational hypothesis [WEAKGEN], which in essence states that “young instances have a tendency to die young”. Consequently, you can assume that most instances in the young area, and especially those in Eden, are no longer referenced and can be removed. The old area contains instances that are unlikely to be removed soon, so it is analyzed less frequently. By treating young and old separately, the average analysis effort is reduced and the GC process becomes more efficient. Generations can be activated by adding the parameter -XX:+ZGenerational to -XX:+UseZGC. In the future, Z GC should only use the approach with generations, and the parameter will no longer be necessary.
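To check which collector a JVM is actually running with, the standard management API can be queried. The following is a minimal sketch, not a benchmark: the class name GcInfo and the allocation loop are illustrative assumptions, and the collector names that are printed differ between collectors and JDK versions.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class, not part of the JDK or the article's project.
public class GcInfo {

    /** Returns the names of the collectors reported by the running JVM. */
    static List<String> collectorNames() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .map(GarbageCollectorMXBean::getName)
                .toList();
    }

    public static void main(String[] args) {
        // Create many short-lived instances, the allocation pattern that
        // the weak generational hypothesis is about.
        long checksum = 0;
        for (int i = 0; i < 5_000_000; i++) {
            byte[] shortLived = new byte[128];
            shortLived[0] = (byte) i;
            checksum += shortLived[0];
        }
        System.out.println("checksum = " + checksum);

        // Print what each collector reports about its work so far.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

On a JDK 21, the sketch could be started with `java -XX:+UseZGC -XX:+ZGenerational -Xlog:gc GcInfo.java` and compared with a run without -XX:+ZGenerational.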
With its efficient handling of huge heaps, low GC pause, and focus on removing young instances, Generational Z GC is optimized for data-intensive applications that require a short response time. Generational Z GC is a good choice for modern data-driven systems in enterprise environments. Because of the high complexity and the amount of theory involved in this area, Generational Z GC receives less attention than it deserves. That’s why Generational ZGC is a hidden hero in the Java ecosystem.
JEP-452: Key Encapsulation Mechanism API
Encryption is used in many places in electronic communication. Many current methods aren’t considered secure in the age of quantum computers. For example, Mallory might be researching a quantum computer and recently had a major breakthrough. Alice and Bob want to organize a surprise party for their friend Mallory. To surprise Mallory, Alice and Bob want to communicate only through secure, encrypted channels. They want to exchange many messages, but Mallory might already have the quantum computer available, so only an efficient, highly secure method will do.
During their research, they learn that there are symmetric and asymmetric methods. In an asymmetric encryption method, two different keys are used for encryption and decryption. The key for encryption can be transmitted without danger, but unfortunately, these algorithms are usually comparatively slow, and the ones in widespread use today aren’t considered secure against quantum computers. Symmetric methods use the same key for encryption and decryption. They clearly score high in efficiency and security. Unfortunately, both sides need to know the key used, and since it is secret, sending it isn’t trivial.
In the article Design and Analysis of Practical Public-Key Encryption Schemes Secure against Adaptive Chosen Ciphertext Attack [KEM-ART], published by Cramer and Shoup in 2001, the key encapsulation mechanism is described in §7.1. Using this, you can transmit keys for symmetric encryption procedures securely with help from asymmetric encryption. In their post-quantum cryptography concepts [KEM-BSI, KEM-NIST], the BSI and NIST consider this procedure, abbreviated as KEM, a basic building block. Figure 1 shows the KEM flow between Alice and Bob. First, Alice generates a public and private key pair. The public key is transmitted to Bob. At this point, transport encryption between Alice and Bob is important, but it’s outside the scope of the procedure. Bob generates a random key for the symmetric encryption scheme used later and encrypts it with Alice’s public key. This data packet, known as the encapsulation, is transmitted back to Alice, and only she can decrypt it with her private key. From now on, Alice and Bob can exchange messages using symmetric encryption, and the key pair is no longer needed.
Fig. 1: Bob and Alice exchange the key for a symmetric encryption (yellow) with KEM. To do this, Bob uses Alice’s public key (green) to send the randomly generated key encrypted to Alice, who decrypts it with her private key (red). From this point on, Alice and Bob can communicate symmetrically encrypted.
Implementing security-related features and especially encryption is a highly complex field. But encryption is also a fundamental requirement for modern systems. JEP-452: Key Encapsulation Mechanism API [JEP-452] introduces an API for performing a KEM process in Java 21. The API follows the process defined by ISO 18033-2 and described above. In this process, three building blocks form the foundation for secure encryption. The javax.crypto.KEM.Encapsulator uses the public key to generate the symmetric key together with its encapsulation. The result is represented by an instance of javax.crypto.KEM.Encapsulated, whose encapsulation can be transmitted directly as a byte array from Bob to Alice. On Alice’s side, the javax.crypto.KEM.Decapsulator class is used to obtain the symmetric key with the matching private key.
In the following example, the sendToBob/sendToAlice and retrieveFromBob/retrieveFromAlice methods represent the sending and receiving ends of the insecure communication channels between Alice and Bob. Listing 1 shows how the setup of the KEM process is implemented for Alice using the KeyPairGenerator. The KeyPairGenerator#getInstance(String) method returns a key pair generator for the RSA process. Additional algorithms can be plugged in using a service provider interface. A pair of public and private keys is generated by calling KeyPairGenerator#generateKeyPair(). This pair is stored with Alice and the public key is sent to Bob.
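Listing 1
import java.security.KeyPairGenerator;

// Alice: generate the key pair and publish only the public key
var keyGen = KeyPairGenerator.getInstance("RSA");
var pair = keyGen.generateKeyPair();
sendToBob(pair.getPublic());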
Listing 2 shows the first key exchange phase on Bob’s side. After the public key is received from Alice, an implementation configured for RSA is selected using KEM#getInstance(String). New implementations can also be provided by an SPI. Note that the JDK 21 itself only ships the DHKEM algorithm, so RSA-KEM has to be supplied by such a provider. On the specific RSA-KEM instance, an encapsulator is created using the method KEM#newEncapsulator(java.security.PublicKey), and Encapsulator#encapsulate() generates the random key together with its encapsulation. The randomly generated key should be retrieved with Encapsulated#key() and stored with Bob (in the variable secret). This key is used to perform the symmetric encryption later. The encapsulation that will be sent is accessible with the Encapsulated#encapsulation() method and is sent from Bob to Alice.
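Listing 2
import javax.crypto.KEM;

// Bob: "RSA-KEM" must be supplied by an installed provider; the JDK 21 itself ships only "DHKEM"
var publicKey = retrieveFromAlice();
var encapsulator = KEM.getInstance("RSA-KEM").newEncapsulator(publicKey);
var encapsulated = encapsulator.encapsulate(); // generates the secret and its encapsulation
var secret = encapsulated.key();               // stays with Bob
sendToAlice(encapsulated.encapsulation());     // byte[] that is safe to transmit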
Listing 3 shows the second phase of the key encapsulation on Alice’s side. After Alice has received the encapsulation from Bob and taken the private key from the key pair generated during setup, unpacking the symmetric key can begin. First, the KEM#getInstance(String) method is used again to load the concrete implementation of KEM for RSA-KEM. The call KEM#newDecapsulator(PrivateKey) then creates a decapsulator for the private key, and Decapsulator#decapsulate(byte[]) recovers the symmetric key from the encapsulation.
Listing 3
import javax.crypto.KEM;

// Alice: recover the shared secret with the private key from Listing 1
byte[] encapsulate = retrieveFromBob();
var privateKey = pair.getPrivate();
var decapsulator = KEM.getInstance("RSA-KEM").newDecapsulator(privateKey);
var secret = decapsulator.decapsulate(encapsulate);
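The three phases can also be tried end to end in a single process. Since the JDK 21 only ships the DHKEM algorithm, the following sketch swaps RSA-KEM for DHKEM with an X25519 key pair; the class name KemRoundTrip and the method roundTrip are assumptions made for this illustration, and the network transfer between Alice and Bob is replaced by plain local variables.

```java
import javax.crypto.KEM;
import javax.crypto.SecretKey;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.util.Arrays;

// Hypothetical demo class: Alice and Bob live in the same process here.
public class KemRoundTrip {

    /** Runs setup, encapsulation and decapsulation; true if both sides hold the same key. */
    static boolean roundTrip() {
        try {
            // Setup (Alice): key pair for DHKEM, which is the algorithm bundled with JDK 21
            KeyPair pair = KeyPairGenerator.getInstance("X25519").generateKeyPair();

            // Phase 1 (Bob): create the shared secret and its encapsulation
            KEM kem = KEM.getInstance("DHKEM");
            KEM.Encapsulated encapsulated = kem.newEncapsulator(pair.getPublic()).encapsulate();
            SecretKey bobSecret = encapsulated.key();
            byte[] wireMessage = encapsulated.encapsulation(); // would be sent to Alice

            // Phase 2 (Alice): recover the shared secret with the private key
            SecretKey aliceSecret = kem.newDecapsulator(pair.getPrivate()).decapsulate(wireMessage);

            return Arrays.equals(bobSecret.getEncoded(), aliceSecret.getEncoded());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("KEM round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Both sides share the same key: " + roundTrip());
    }
}
```

The sketch requires Java 21 or later. Swapping the algorithm only changes the two getInstance calls, which is the point of the provider-based design.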
From now on, the public and private key pair isn’t needed and can be deleted. Communication can now be done securely and efficiently with a symmetric encryption scheme. In this new Java 21 feature, the OpenJDK community upgraded the Java ecosystem for the post-quantum era. You can now use Java in a world full of quantum computers. For this reason, this feature should get more attention. It’s definitely a Java 21 hidden hero.
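How the shared key might be used afterwards can be sketched with AES in GCM mode. This is an illustration under assumptions rather than part of JEP-452: the class SharedSecretMessaging, its methods, and the message layout (IV followed by ciphertext) are made up for this example, and the 32 random bytes in main merely stand in for a secret obtained through KEM. Real protocols usually derive the message key from the shared secret with a key derivation function first.

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

// Hypothetical helper class for exchanging messages with a shared symmetric key.
public class SharedSecretMessaging {
    private static final int IV_BYTES = 12;   // IV length commonly used with GCM
    private static final int TAG_BITS = 128;  // length of the authentication tag

    /** Encrypts the text and returns IV + ciphertext as one message. */
    static byte[] encrypt(SecretKey key, String plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv); // fresh IV for every message
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            byte[] message = new byte[IV_BYTES + ciphertext.length];
            System.arraycopy(iv, 0, message, 0, IV_BYTES);
            System.arraycopy(ciphertext, 0, message, IV_BYTES, ciphertext.length);
            return message;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Encryption failed", e);
        }
    }

    /** Splits the message into IV and ciphertext and decrypts it. */
    static String decrypt(SecretKey key, byte[] message) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, message, 0, IV_BYTES));
            byte[] plaintext = cipher.doFinal(message, IV_BYTES, message.length - IV_BYTES);
            return new String(plaintext, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Decryption failed", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the secret that Alice and Bob would obtain through KEM.
        byte[] secretBytes = new byte[32];
        new SecureRandom().nextBytes(secretBytes);
        SecretKey key = new SecretKeySpec(secretBytes, "AES");

        byte[] message = encrypt(key, "Surprise party on Friday");
        System.out.println(decrypt(key, message));
    }
}
```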
JEP-413: Code Snippets in Java API Documentation
Developers have their own interface requirements. Usability and comprehensible documentation are usually the most important aspects. Because of this, Javadoc has been a part of the JDK as a source code documentation tool since Java 1. With Javadoc, comments can be enriched using descriptive tags and transformed into a navigable, searchable interface documentation. Besides parameter and behavior descriptions, it’s useful to document intended usage in order to reduce hurdles. A few different approaches have been established.
- Separate tutorials such as the reference documentation from Spring [Spring-Data-JPA] or extensions guides from Quarkus [Quarkus-RESTEasy]. The focus here is more on using the framework and less on the API.
- Code snippets as HTML with <pre>{@code } </pre> focus on the API. Unfortunately, the snippet isn’t nicely formatted and must be updated by hand whenever the interface is changed. An example is java.util.stream.Stream [stream javadoc, stream code].
- Considering unit tests as documentation is a good approach, provided that the tests are good and the developers using them have access to the code. Unfortunately, this approach often lacks the link between code and tests.
Since Java 18, JEP 413: Code Snippets in Java API Documentation [JEP413] provides the possibility to combine source code snippets with syntax highlighting and testability. This allows developers to include good examples in the API documentation. The new {@snippet: … } tag is used to include a snippet in a Javadoc comment.
Listing 4
/**
* Calculation of VAT for a private customer on purchase in the amount of 1055
* {@snippet :
* var customer = new Privatecustomer("Merlin", "");
* var value = 1055d;
* // ...
* var mwst = MwStCalculator.PlainOOP.calculateMwSt(customer, value);
* }
*/
Listing 4 shows an inline snippet of the expected interaction with the VAT Calculator API from my [BoegDOP] Data Oriented Programming sample project. In the generated documentation, the area from the new line after the colon to the last line before the closing curly brace is included as formatted source code with syntax highlighting. There are two constraints for inline snippets:
- A multiline comment with /* */ is not allowed.
- For each opening curly brace, there must be a closing one too.
Without these constraints, it’s not possible for the generator to correctly convert the passage. In addition, syntactical correctness must be checked manually and interface changes must be taken into account. None of these constraints apply for an external snippet. For an external snippet, the content is not specified in the same comment, but is taken from an existing Java file.
Listing 5
/**
 * Calculation of VAT for a private customer on purchase in the amount of 1055
 * {@snippet file="Snippets.java" region="example"}
 */

// File: snippet-files/Snippets.java
class Snippets {
    void snippet01() {
        // @start region="example"
        // @replace region="replace" regex='(= new.*;)|(= [0-9]*d;)' replacement="= ..."
        // @highlight region="highlight" regex="\bMwStCalculator\b"
        var customer = new Privatecustomer("Merlin", "[email protected]"); // @replace regex="var" replacement="Customer"
        var value = 1055d; // @replace regex="var" replacement="double"
        /* .. */
        var mwst = MwStCalculator.PlainOOP.calculateMwSt(customer, value); // @link substring="PlainOOP.calculateMwSt" target="MwStCalculator.PlainOOP#calculateMwSt"
        // @end @end @end
    }
}
Listing 5 uses an external snippet that points to a Snippets.java file (also Listing 5) in the snippet-files folder. This folder is located right next to the file where the snippet is included, and a different location can be configured with the --snippet-path option of the javadoc tool. Configuring the path to the folder containing the tests seems like a good default. This makes it possible to reuse the written tests as examples. In Listing 5, a region is also defined. This means that only certain areas of the referenced Java file are used and the file can contain examples for different use cases.
Besides regions, other adjustments were made in the snippet in Listing 5. With the first @replace tag, all initializations are replaced with “...” per regular expression, since they don’t directly contribute to the example. The keyword var is replaced by the data type in the corresponding lines. No region is specified here, so the replacement is only applied to the respective line. The @highlight tag is used to highlight each occurrence of “MwStCalculator” and @link is used to create a link to the MwStCalculator.PlainOOP#calculateMwSt method. Figure 2 shows the result of the Javadoc generation for Listing 5.
Fig. 2: Documentation generated from Listing 5 with replacements, links, and highlights
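Markup tags are not limited to external snippets; they can also be used in end-of-line comments inside an inline snippet. The following self-contained sketch shows this on a made-up class: VatCalculator and vatInCents are hypothetical names chosen for this illustration and are not part of the sample project.

```java
// Hypothetical example class, documented with an inline snippet.
public class VatCalculator {

    /**
     * Calculates the VAT share of a net amount, using whole cents.
     *
     * {@snippet :
     * long net = 105_500;                            // @replace substring="105_500" replacement="..."
     * long vat = VatCalculator.vatInCents(net, 19);  // @highlight substring="vatInCents"
     * }
     *
     * @param netCents    net amount in cents
     * @param ratePercent VAT rate in percent
     * @return the VAT in cents, rounded down
     */
    public static long vatInCents(long netCents, int ratePercent) {
        return netCents * ratePercent / 100;
    }

    public static void main(String[] args) {
        System.out.println(vatInCents(105_500, 19)); // prints 20045
    }
}
```

Running `javadoc -d docs VatCalculator.java` with a JDK 18 or later should render the snippet with the literal replaced and the method name highlighted, while the markup comments themselves do not appear in the output.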
The ability to link source code and API documentation helps keep listings in Javadoc up to date and of high quality. The API documentation’s quality, timeliness, and comprehensibility improve. However, if all tags are used in a snippet, the underlying code becomes more difficult to read. This primarily affects framework and tool developers, but everyone benefits from the extra effort. This is why Code Snippets in Java API Documentation is a hidden Java 21 hero.
Summary
This article showed three of the many hidden heroes in the JDK ecosystem, aside from Virtual Threads and Pattern Matching. There are a few more like Class Data Sharing and the Simple Web Server, but that’s material for another post, conferences, user groups, or self-study.
References
[Parallel-GC]: https://docs.oracle.com/en/java/javase/20/gctuning/parallel-collector1.html
[G1-GC]: https://docs.oracle.com/en/java/javase/20/gctuning/garbage-first-garbage-collector-tuning.html
[Z-GC]: https://docs.oracle.com/en/java/javase/20/gctuning/z-garbage-collector.html
[ZGC-ARCH]: https://dev.java/learn/jvm/tool/garbage-collection/zgc-deepdive/
[JEP-439]: https://openjdk.org/jeps/439
[WEAKGEN]: https://docs.oracle.com/en/java/javase/17/gctuning/garbage-collector-implementation.html
[KEM-ART]: Design and Analysis of Practical Public-Key Encryption Schemes Secure against Adaptive Chosen Ciphertext Attack, Cramer and Shoup 2001, https://eprint.iacr.org/2001/108.pdf
[KEM-NIST]: https://csrc.nist.gov/News/2022/pqc-candidates-to-be-standardized-and-round-4
[JEP-452]: https://openjdk.org/jeps/452
[Spring-Data-JPA]: https://docs.spring.io/spring-data/jpa/docs/current/reference/html/
[Quarkus-RESTEasy]: https://quarkus.io/guides/resteasy-reactive
[Stream-Javadoc]: https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/stream/Stream.html
[JEP413]: https://openjdk.org/jeps/413
[BoegDOP]: https://github.com/MBoegers/DataOrientedJava